A Database for Measuring Linguistic Information Content

Authors

  • Richard Sproat
  • Bruno Cartoni
  • HyunJeong Choe
  • David Huynh
  • Linne Ha
  • Ravindran Rajakumar
  • Evelyn Wenzel-Grondie
Abstract

Which languages convey the most information in a given amount of space? Linguists are often asked this question, especially by engineers who have some information-theoretic measure of “information” in mind but rarely define exactly how they would measure it. The question is, in fact, remarkably hard to answer, and many linguists consider it unanswerable. Yet it seems as if it ought to have an answer. If one had a database of close translations between a set of typologically diverse languages, with detailed marking of morphosyntactic and morphosemantic features, one could hope to quantify the differences in how these languages convey information. Since no appropriate database exists, we decided to construct one. The purpose of this paper is to present our work on the database, along with some preliminary results. We plan to release the dataset once

Publication date: 2014